Learning from a Neighbor: Adapting a Japanese Parser for Korean Through Feature Transfer Learning
نویسندگان
چکیده
We present a new dependency parsing method for Korean applying cross-lingual transfer learning and domain adaptation techniques. Unlike existing transfer learning methods relying on aligned corpora or bilingual lexicons, we propose a feature transfer learning method with minimal supervision, which adapts an existing parser to the target language by transferring the features for the source language to the target language. Specifically, we utilize the Triplet/Quadruplet Model, a hybrid parsing algorithm for Japanese, and apply a delexicalized feature transfer for Korean. Experiments with Penn Korean Treebank show that even using only the transferred features from Japanese achieves a high accuracy (81.6%) for Korean dependency parsing. Further improvements were obtained when a small annotated Korean corpus was combined with the Japanese training corpus, confirming that efficient crosslingual transfer learning can be achieved without expensive linguistic resources.
منابع مشابه
A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization
Spam is an unwanted email that is harmful to communications around the world. Spam leads to a growing problem in a personal email, so it would be essential to detect it. Machine learning is very useful to solve this problem as it shows good results in order to learn all the requisite patterns for classification due to its adaptive existence. Nonetheless, in spam detection, there are a large num...
متن کاملImage alignment via kernelized feature learning
Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption for most of the machine learning algorithms is that the training set (source domain) and the test set (target domain) follow from the same probability distribution. However, in most of the real-world application...
متن کاملImage Classification via Sparse Representation and Subspace Alignment
Image representation is a crucial problem in image processing where there exist many low-level representations of image, i.e., SIFT, HOG and so on. But there is a missing link across low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principle component analysis are employed to d...
متن کاملPhonological Awareness Impact on Articulatory Accuracy of the Spanish Liquid [r] in Japanese FL Learners of Spanish
Foreign language learners tend to avoid phonological difficulties and simply transfer sounds whether from their L1 or any pre-existing L2. Phonological awareness (PA) gives students an active role in understanding their own potential in improving pronunciation through several methods. However, such methods are likely to be restricted to only passive learning methods, such as repetition, reading...
متن کاملPhonological Awareness Impact on Articulatory Accuracy of the Spanish Liquid [r] in Japanese FL Learners of Spanish
Foreign language learners tend to avoid phonological difficulties and simply transfer sounds whether from their L1 or any pre-existing L2. Phonological awareness (PA) gives students an active role in understanding their own potential in improving pronunciation through several methods. However, such methods are likely to be restricted to only passive learning methods, such as repetition, reading...
متن کامل